18.785: Analytic Number Theory, MIT, spring 2007 (K.S. Kedlaya) 
Revisiting the sieve of Eratosthenes 



This unit begins the second part of the course, in which we will investigate a class of 
methods in analytic number theory known as sieves. (For non-native speakers of English: 
in ordinary life, a sieve is a device through which you pour a powder, like flour, to filter 
out large impurities.) Whereas the first part of the course leaned heavily on methods from 
complex analysis, here the emphasis will be more combinatorial. 



1 The Sieve of Eratosthenes 

The original sieve is of course the Sieve of Eratosthenes for finding prime numbers. To 
find the prime numbers in {2, . . . , n}, you repeat the following operation as long as there 
are unmarked numbers: find the first unmarked number p, mark it as prime, then mark 
2p, 3p, . . . as composite until you get to a number greater than n. 
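
The marking procedure just described translates almost line for line into code. The following is a minimal Python sketch; the function name `eratosthenes` and the boolean array used to record the marks are illustrative choices, not part of the notes.

```python
def eratosthenes(n):
    """Return the list of primes in {2, ..., n}."""
    # is_composite[m] is True once m has been marked as composite.
    is_composite = [False] * (n + 1)
    primes = []
    for p in range(2, n + 1):
        if is_composite[p]:
            continue
        # p is the first unmarked number: mark it as prime ...
        primes.append(p)
        # ... then mark 2p, 3p, ... as composite, stopping beyond n.
        for multiple in range(2 * p, n + 1, p):
            is_composite[multiple] = True
    return primes


print(eratosthenes(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```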

Of course, one need only sift out multiples of primes up to $n^{1/2}$ in order to leave only 
primes behind. More generally, if one is only able to sift out multiples of primes up to $n^a$, 
what remain are numbers with no prime factors less than $n^a$. In particular, any such number 
has at most $\lfloor a^{-1} \rfloor$ prime factors, and so is in some sense "nearly prime". 
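
To see the "nearly prime" phenomenon concretely, here is a small sketch of a truncated sieve. To avoid floating-point roots it is parametrized by an integer `k` (playing the role of $1/a$), and it sifts multiples of primes $p$ with $p^k \le n$; the names `partial_sieve` and `big_omega` are hypothetical helpers introduced only for this illustration.

```python
def big_omega(m):
    """Number of prime factors of m, counted with multiplicity."""
    count, p = 0, 2
    while p * p <= m:
        while m % p == 0:
            m //= p
            count += 1
        p += 1
    return count + (1 if m > 1 else 0)


def partial_sieve(n, k):
    """Survivors in {2..n} after sifting 2p, 3p, ... for primes p with p**k <= n."""
    removed = [False] * (n + 1)
    for p in range(2, n + 1):
        if p ** k > n:
            break
        if removed[p]:
            continue  # p is composite: a smaller prime already sifted it
        for multiple in range(2 * p, n + 1, p):
            removed[multiple] = True
    return [m for m in range(2, n + 1) if not removed[m]]


survivors = partial_sieve(1000, 3)  # sift by primes up to 1000^(1/3)
print(max(big_omega(m) for m in survivors))  # never exceeds k = 3
```

With `k = 2` the survivors are exactly the primes, recovering the remark about sifting up to the square root.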

Of course, in the process of sieving, many numbers will be sifted out more than once. 
If one wants to draw any sort of quantitative conclusion from this process, one must keep 
track of the multiple counting; this suggests using inclusion-exclusion. 



2 The principle of inclusion-exclusion 

Let $S$ be a finite set, and let $P_1, \dots, P_n$ be subsets of $S$. Think of each $P_i$ as containing the 
elements of $S$ with a certain property. 

Suppose we have some way to count the number of elements in the intersection of any 
subcollection of the $P_i$, but what we really want is to count the complement of the union of 
all of the $P_i$. The formula that computes this is: 

\[ \#\bigl(S \setminus (P_1 \cup \cdots \cup P_n)\bigr) = \sum_{T \subseteq \{1,\dots,n\}} (-1)^{\#T} \, \#\Bigl( \bigcap_{t \in T} P_t \Bigr). \]

(For $T = \emptyset$, the intersection is interpreted as $S$ itself.) 

Proof: if $s \in S$ belongs to $m$ of the subsets, then the number of times it gets counted on the 
right side is 

\[ \binom{m}{0} - \binom{m}{1} + \binom{m}{2} - \cdots + (-1)^m \binom{m}{m} = (1-1)^m, \]

which equals 1 if $m = 0$ and vanishes otherwise. 
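
As a sanity check, the inclusion-exclusion count can be compared against a direct count on a small example. This is only a brute-force sketch (it loops over all $2^n$ index sets $T$); the function name `complement_count` and the sample sets are assumptions made for illustration.

```python
from itertools import combinations


def complement_count(S, subsets):
    """Count elements of S lying in none of the subsets, via inclusion-exclusion."""
    total = 0
    n = len(subsets)
    for r in range(n + 1):
        for T in combinations(range(n), r):
            # Intersection of the chosen subsets; the empty choice gives S itself.
            inter = set(S)
            for t in T:
                inter &= subsets[t]
            total += (-1) ** r * len(inter)
    return total


S = set(range(1, 31))
P = [{m for m in S if m % p == 0} for p in (2, 3, 5)]
direct = len(S - set().union(*P))
print(complement_count(S, P), direct)  # both equal 8
```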

More generally, if $f : S \to \mathbb{C}$ is some function, and we want to compute the sum of $f$ 
over the complement of the $P_i$, we have 

\[ \sum_{s \in S \setminus (P_1 \cup \cdots \cup P_n)} f(s) = \sum_{T \subseteq \{1,\dots,n\}} (-1)^{\#T} \sum_{s \in \bigcap_{t \in T} P_t} f(s). \]

In number theory, we are often taking $S = \{1, \dots, N\}$ and taking the sets $P_1, P_2, \dots$ to 
be the sets of multiples of certain small primes. It is convenient to rewrite the principle of 
inclusion-exclusion in terms of the arithmetic function $\mu$, the Möbius function: 

\[ \mu(n) = \begin{cases} (-1)^d & n = p_1 \cdots p_d \ (p_1, \dots, p_d \text{ distinct},\ d \ge 0) \\ 0 & \text{otherwise.} \end{cases} \]
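
One standard form of this rewriting (stated here as a well-known identity rather than quoted from the notes) is that the number of $m \in \{1, \dots, N\}$ divisible by none of the distinct primes $p_1, \dots, p_k$ equals $\sum_{d \mid p_1 \cdots p_k} \mu(d) \lfloor N/d \rfloor$. The sketch below checks this on a small case; `mobius` and `count_coprime` are illustrative names, and the divisor loop is deliberately naive.

```python
def mobius(n):
    """Mobius function: (-1)^d if n is a product of d distinct primes, else 0."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # p^2 divides the original n
            result = -result
        p += 1
    if n > 1:
        result = -result  # one remaining prime factor
    return result


def count_coprime(N, primes):
    """Count m in {1..N} divisible by none of the given distinct primes."""
    modulus = 1
    for p in primes:
        modulus *= p
    # Sum mu(d) * floor(N/d) over the divisors d of the product of the primes.
    return sum(mobius(d) * (N // d)
               for d in range(1, modulus + 1) if modulus % d == 0)


print(count_coprime(100, [2, 3, 5]))  # 26
```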




3 Smooth numbers 

Before proceeding, I need a quick lemma concerning smooth numbers. A natural number is 
z-smooth if its prime factors are all less than or equal to z. 

Lemma 1 (Rankin). Let $\Psi(x,z)$ be the number of $z$-smooth numbers less than or equal to 
$x$. Then for any $\delta > 0$, 

\[ \Psi(x,z) \le x^\delta \prod_{p \le z} (1 - p^{-\delta})^{-1}. \]

Proof. If we expand the right side as a product of geometric series, we get a term $(x/n)^\delta \ge 1$ 
for each $z$-smooth number $n \le x$ (among other terms). This yields the claim. □ 
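
Rankin's bound is easy to test numerically for small parameters. The following sketch counts smooth numbers by brute force and evaluates the product on the right side for a few values of the exponent; the helper names `primes_up_to`, `smooth_count`, and `rankin_bound` are assumptions made for this illustration.

```python
import math


def primes_up_to(z):
    """Primes p <= z by trial division (adequate for small z)."""
    return [p for p in range(2, z + 1)
            if all(p % q for q in range(2, math.isqrt(p) + 1))]


def smooth_count(x, z):
    """Number of n in {1..x} all of whose prime factors are <= z."""
    ps = primes_up_to(z)
    count = 0
    for n in range(1, x + 1):
        m = n
        for p in ps:
            while m % p == 0:
                m //= p
        if m == 1:  # nothing but primes <= z divided n
            count += 1
    return count


def rankin_bound(x, z, delta):
    """x^delta times the product over primes p <= z of (1 - p^(-delta))^(-1)."""
    bound = float(x) ** delta
    for p in primes_up_to(z):
        bound /= 1 - p ** (-delta)
    return bound


print(smooth_count(100, 5))  # 34
for delta in (0.25, 0.5, 0.75, 1.0):
    print(delta, rankin_bound(100, 5, delta))
```

The bound holds for every positive exponent, so in applications one is free to choose the exponent that makes the right side smallest.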



4 Back to Eratosthenes 

Here is a modern version of the Sieve of Eratosthenes, following Murty and Saradha. Let $A$ 
be a set of natural numbers, and let $P$ be a set of primes; also set 

\[ P(z) = \prod_{p \in P,\ p < z} p. \]

For each $p \in P$, choose a set $R_p$ consisting of some number $\omega(p)$ of residue classes modulo 
$p$, and let $A_p$ be the subset of $A$ whose elements belong to the chosen residue classes. Put 

\[ W(z) = \prod_{p \mid P(z)} \Bigl( 1 - \frac{\omega(p)}{p} \Bigr). \]

For $d$ squarefree with all prime factors in $P$, put $\omega(d) = \prod_{p \mid d} \omega(p)$ and $A_d = \bigcap_{p \mid d} A_p$. 

We wish to estimate $S(A, P, z)$, the number of elements of $A$ not belonging to $A_p$ for any 
$p \in P$ with $p < z$. For this, we must assume some good properties about the chosen residue classes. For 
starters, we want that for some $\kappa > 0$, 

\[ \sum_{p < z,\ p \in P} \frac{\omega(p) \log p}{p} \le \kappa \log z + O(1), \tag{1} \]

where the big-O bound is for $z \to \infty$ and the constant depends on $P, R_p, \kappa$. 
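
For small parameters the quantities in this setup can be computed by brute force. The sketch below assumes one specific choice, the one used later in the proof of Theorem 5: $A = \{1, \dots, x\}$, $P$ the odd primes, and $R_p$ the residue classes of $0$ and $-2$, so that $\omega(p) = 2$. The function names are hypothetical.

```python
import math


def odd_primes_below(z):
    """Odd primes p < z, by trial division (fine for small z)."""
    return [p for p in range(3, z, 2)
            if all(p % q for q in range(3, math.isqrt(p) + 1, 2))]


def sifted_count(x, z):
    """S(A, P, z) for A = {1..x}, P = odd primes, R_p = {0, -2 mod p}."""
    primes = odd_primes_below(z)
    # n survives if, for every p, it avoids both chosen residue classes.
    return sum(1 for n in range(1, x + 1)
               if all(n % p != 0 and (n + 2) % p != 0 for p in primes))


def W(z):
    """Product over odd primes p < z of (1 - omega(p)/p), with omega(p) = 2."""
    w = 1.0
    for p in odd_primes_below(z):
        w *= 1 - 2 / p
    return w


x, z = 30, 6
print(sifted_count(x, z))  # 6 survivors: 2, 11, 14, 17, 26, 29
print(x * W(z))            # the main term X W(z) with X = x
```

In this tiny example the count and the main term agree because $x$ is a multiple of $3 \cdot 5$; in general they differ by an error term, which is what the results below control.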

Lemma 2. Assuming (1), we have 

\[ \sum_{d \le t,\ d \mid P(z)} \omega(d) = O\Bigl( t (\log z)^{\kappa} \exp\Bigl( -\frac{\log t}{\log z} \Bigr) \Bigr), \]

where the big-O bound is for $z \to \infty$ and the constant depends on $P, R_p, \kappa$. 

Proof. Exercise. □ 

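Lemma 3. Fix $C > 0$. Assuming (1), we have 

\[ \sum_{d > Cx,\ d \mid P(z)} \frac{\omega(d)}{d} = O\Bigl( (\log z)^{\kappa+1} \exp\Bigl( -\frac{\log x}{\log z} \Bigr) \Bigr), \]

where the big-O bound is for $z \to \infty$ and the constant depends on $P, R_p, \kappa, C$. 

Proof. Put $F_\omega(t,z) = \sum_{d \le t,\ d \mid P(z)} \omega(d)$. Then 

\[ \sum_{d > Cx,\ d \mid P(z)} \frac{\omega(d)}{d} \le \int_{Cx}^{\infty} \frac{F_\omega(t,z)}{t^2} \, dt \tag{2} \]

(exercise), so the result follows from Lemma 2. □ 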

Theorem 4. Fix $P, R_p, \kappa$ satisfying (1), and also fix $C, c > 0$. Then for any set $A$ and any 
$X, x > 0$ such that 

\[ \Bigl| \#A_d - \frac{\omega(d)}{d} X \Bigr| \le c\,\omega(d) \]

and $\#A_d = 0$ for $d > Cx$, we have 

\[ S(A, P, z) = XW(z) + O\Bigl( x \log^{\kappa+1} z \, \exp\Bigl( -\frac{\log x}{\log z} \Bigr) \Bigr), \]

where the big-O bound is for $z \to \infty$, uniformly in $A, x, X$. 

Proof. Exercise. □ 



5 Motivation: the twin prime conjecture 



The twin prime conjecture states that there are infinitely many primes $p$ such that $p + 2$ is 
also prime. One can even guess the correct asymptotic up to a constant factor, by a very 
simple argument: since the probability of a random number in $[1, \dots, N]$ being prime is 
asymptotically $1/\log N$, the number of twin primes in $[1, \dots, N]$ should be asymptotic to 
$N/\log^2 N$. (Getting the constant right is a bit trickier; I won't deal with that just now.) 
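
The heuristic can be compared with actual counts for modest $N$. The sketch below counts primes $p \le N$ with $p + 2$ also prime and prints the ratio to $N/\log^2 N$; it is only a numerical illustration (no claim is made about the limiting constant), and the function name `twin_prime_count` is an assumption.

```python
import math


def twin_prime_count(N):
    """Number of primes p <= N such that p + 2 is also prime."""
    limit = N + 2
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            for m in range(p * p, limit + 1, p):
                is_prime[m] = False
    return sum(1 for p in range(2, N + 1) if is_prime[p] and is_prime[p + 2])


for N in (10**3, 10**4, 10**5):
    count = twin_prime_count(N)
    # Ratio of the true count to the heuristic size N / log^2 N.
    print(N, count, count / (N / math.log(N) ** 2))
```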

As a corollary of Theorem 4, we obtain the following result of Brun (with a slightly 
simpler proof). 

Theorem 5. The number of primes $p \le x$ such that $p + 2$ is also prime is $O(x (\log\log x)^2 / (\log x)^2)$. 

Proof. We will apply Theorem 4 with $A = \{1, \dots, x\}$ and $P = \{p : 2 < p < z\}$. For each 
$p \in P$, let $R_p$ consist of the residue classes of $0, -2$, so that $\omega(p) = 2$. For $d$ odd squarefree, 
$\omega(d) = 2^{\nu(d)}$ for $\nu(d)$ the number of prime factors of $d$. One checks easily (exercise) that 

\[ \Bigl| \#A_d - \frac{\omega(d)}{d} x \Bigr| \le \omega(d). \tag{3} \]



Since 

\[ \sum_{p < z} \frac{\log p}{p} = \log z + O(1) \]



from a prior homework, we can take $\kappa = 2$ in Theorem 4. This yields 

\[ S(A, P, z) = xW(z) + O\Bigl( x \log^{3} z \, \exp\Bigl( -\frac{\log x}{\log z} \Bigr) \Bigr), \]

where the big-O constant does not depend on $x$ or $z$. We now take 

\[ \log z = \frac{\log x}{A \log\log x} \]

for a suitable constant $A$. Since 

\[ W(z) \le \prod_{3 \le p < z} \Bigl( 1 - \frac{1}{p} \Bigr)^{2} = O\bigl( (\log z)^{-2} \bigr) \]

by a prior homework exercise, we deduce that $S(A, P, z) = O(x (\log\log x)^2 / (\log x)^2)$. 

To conclude, note that $S(A, P, z)$ includes all primes $p$ with $z + 2 < p \le x$ such that $p + 2$ is 
also prime. The number of twin primes up to $x$ that we missed is at most $z = x^{1/(A \log\log x)}$, 
so this doesn't affect the claim. □ 

We will get a sharper result using Selberg's sieve in a subsequent lecture. 



Exercises 

1. Prove Lemma 2 using Rankin's trick. 

2. Prove (2). 

3. Prove Theorem 4. 

4. Prove (3). 

5. (Brun) Prove that the sum of the reciprocals of the twin primes converges. 

6. Prove that 

\[ \Psi(x, z) = O\Bigl( x \log z \, \exp\Bigl( -\frac{\log x}{\log z} \Bigr) \Bigr), \]

where the big-O bound is for $z \to \infty$, uniformly in $x$. (Hint: apply Rankin's lemma 
with $\delta = 1 - (\log z)^{-1}$.) 

7. Prove that the number of squarefree integers in $\{1, \dots, N\}$ is 

\[ \frac{6}{\pi^2} N + O(N^{1-\epsilon}) \]

for some explicit value of $\epsilon$. (Hint: this is much easier than sieving over primes! Just 
make sure to round no more than $O(N^{1-\epsilon})$ fractions off to the nearest integer. 
Also, don't forget that $6/\pi^2 = 1/\zeta(2) = \prod_p (1 - 1/p^2)$.) 
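
Before attempting the last exercise, it may help to look at the numbers. The following sketch counts squarefree integers by sifting out multiples of squares and compares the result with $6N/\pi^2$; it is a numerical illustration only, not a solution, and the function name `squarefree_count` is an assumption.

```python
import math


def squarefree_count(N):
    """Number of squarefree integers in {1..N}, by sifting multiples of squares."""
    squarefree = [True] * (N + 1)
    for k in range(2, math.isqrt(N) + 1):
        for m in range(k * k, N + 1, k * k):
            squarefree[m] = False
    return sum(squarefree[1:])


for N in (10**2, 10**4, 10**6):
    count = squarefree_count(N)
    # Difference between the true count and the main term 6N/pi^2.
    print(N, count, count - 6 * N / math.pi ** 2)
```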